Consistent and unbiased cardinality estimation for complex queries with conjuncts of predicates

ABSTRACT

The present invention provides a method of selectivity estimation in which preprocessing steps improve the feasibility and efficiency of the estimation. The preprocessing steps are partitioning (to make iterative scaling estimation terminate in a reasonable time for even large sets of predicates), forced partitioning (to enable partitioning in case there are no “natural” partitions, by finding the subsets of predicates to create partitions that least impact the overall solution); inconsistency resolution (in order to ensure that there always is a correct and feasible solution), and implied zero elimination (to ensure convergence of the iterative scaling computation under all circumstances). All of these preprocessing steps make a maximum entropy method of selectivity estimation produce a correct cardinality model, for any kind of query with conjuncts of predicates. In addition, the preprocessing steps can also be used in conjunction with prior art methods for building a cardinality model.

BACKGROUND OF THE INVENTION

The present invention relates generally to the field of query plan optimization in relational databases and, more specifically, to preprocessing for efficiently optimizing query plans having conjunctive predicates.

When comparing alternative query execution plans (QEPs), a cost-based query optimizer in a relational database management system (RDBMS) needs to estimate the selectivity of conjunctive predicates. The optimizer immediately faces a problem of how to combine available partial information about selectivities in a consistent and comprehensive manner. Estimating the selectivity of predicates has always been a challenging task for a query optimizer in a relational database management system. A classic problem has been the lack of detailed information about the joint frequency distribution of attribute values in the table of interest. Perhaps ironically, the additional information now available to modern optimizers has in a certain sense made the selectivity-estimation problem even harder.

Specifically, consider the problem of estimating the selectivity s_(1,2, . . . ,n) of a conjunctive predicate of the form p₁ ∧ p₂ ∧ . . . ∧ p_(n), where each p_(i) is a simple predicate (also called a Boolean Factor, or BF) of the form “column op literal”. Here “column” is a column name, “op” is a relational comparison operator such as “=”, “>”, or “LIKE”, and “literal” is a literal in the domain of the column. Some examples of simple predicates are ‘make=“Honda”’ and ‘year > 1984’. The selectivity of a predicate p, as known in the art, may be defined as the fraction (or, alternatively, the cardinality referring to the absolute number of satisfying rows) of rows in the table that satisfy p (where p is not restricted to conjunctive form). In typical prior art optimizers, statistics are maintained on each individual column, so that the individual selectivities s₁, s₂, . . . , s_(n) of p₁, p₂, . . . , p_(n) are available. Such a query optimizer would then impose an independence assumption and estimate the desired selectivity as s_(1,2, . . . , n)=s₁*s₂* . . . * s_(n). This type of estimate ignores correlations between attribute values, and consequently can be inaccurate, often underestimating the true selectivity by orders of magnitude and leading to a poor choice of query execution plan (QEP).

To overcome the problems—such as inaccuracy resulting from ignoring correlations—caused by using the independence assumption, the optimizer can store the multidimensional joint frequency distribution for all of the columns in the database. However, in practice, the amount of storage required for the full distribution is exponentially large, making this approach infeasible. Alternative approaches therefore have been proposed for storage of selected multivariate statistics (MVS) that summarize important partial information about the joint distribution. Proposals have ranged from multidimensional histograms on selected columns to other, simpler forms of column-group statistics. Thus, for predicates p₁, p₂, . . . , p_(n), the optimizer typically has access to the individual selectivities s₁, s₂, . . . , s_(n) as well as a limited collection of joint selectivities, such as s_(1,2), s_(3,5), and s_(2,3,4). The independence assumption is then used to “fill in the gaps” in the incomplete information, e.g., to estimate the unknown selectivity s_(1,2,3) by s_(1,2)*s₃.

The problem, alluded to above, of combining available partial information about selectivities in a consistent and comprehensive manner now arises, however. There may be multiple, non-equivalent ways of estimating the selectivity for a given predicate. FIG. 1, for example, shows possible QEPs (a), (b), and (c) for a query consisting of the conjunctive predicate p₁ ∧ p₂ ∧ p₃. The QEP (a) in FIG. 1 uses an index-ANDing operation (∧) to apply p₁ ∧ p₂ and afterwards applies predicate p₃ by a FETCH operator, which retrieves rows from a base table according to the row identifiers returned from the index-ANDing operator. The optimizer may know the selectivities s₁, s₂, s₃ of the BFs p₁, p₂, p₃. It may also know about a correlation between p₁ and p₂ via knowledge of the selectivity s_(1,2) of p₁ ∧ p₂. Using independence, the optimizer might then estimate the selectivity of p₁ ∧ p₂ ∧ p₃ as s^(a)_(1,2,3)=s_(1,2)*s₃.

FIG. 1 shows an alternative QEP (b) that first applies p₁ ∧ p₃ and then applies p₂. If the optimizer also knows the selectivity s_(1,3) of p₁ ∧ p₃, use of the independence assumption might yield a selectivity estimate s^(b)_(1,2,3)=s_(1,3)*s₂. However, this would result in an inconsistency if, as is likely, s^(a)_(1,2,3)≠s^(b)_(1,2,3). There are potentially other choices, such as s₁*s₂*s₃ or, if s_(2,3) is known, then s_(1,2)*s_(2,3)/s₂ (the latter estimate amounts to a conditional independence assumption). Any choice of estimate will be arbitrary, since there is no supporting knowledge to justify ignoring a correlation or assuming conditional independence. Such a choice will then arbitrarily bias the optimizer towards choosing one plan over the other. Even worse, if the optimizer does not use the same choice of estimate every time that it is required, then different plans will be estimated inconsistently, leading to “apples and oranges” comparisons and unreliable plan choices.
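The disagreement between the competing estimates can be made concrete with a minimal sketch. The selectivity values below are illustrative assumptions (they coincide with the values used in the example of FIG. 3 later in this description), and the variable names are hypothetical.

```python
# Illustrative selectivities (assumed values, not taken from a real catalog).
s1, s2, s3 = 0.1, 0.2, 0.25
s12, s13 = 0.05, 0.03

# Three non-equivalent estimates of the same quantity s_(1,2,3).
estimates = {
    "full independence: s1*s2*s3": s1 * s2 * s3,
    "QEP (a): s12*s3": s12 * s3,
    "QEP (b): s13*s2": s13 * s2,
}
for name, value in estimates.items():
    print(f"{name} = {value:.4f}")

# Ratio between the largest and the smallest estimate.
spread = max(estimates.values()) / min(estimates.values())
print(f"largest / smallest = {spread:.2f}")
```

Under these assumed numbers the three estimates differ by a factor of 2.5, which illustrates why an optimizer that mixes them would compare plans on unequal terms.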

Assuming that the QEP (a) in FIG. 1 is the first to be evaluated, a prior art optimizer might avoid the foregoing consistency problem by recording the fact that s_(1,2) was applied and then avoiding future application of any other MVS that contain either p₁ or p₂, but not both. In the above example, the selectivities for the QEP (c) in FIG. 1 would be used and the ones for QEP (b) would not. The prior art optimizer would therefore compute the selectivity of p₁ ∧ p₃ to be s₁*s₃ using independence, instead of using the MVS s_(1,3). Thus, the selectivity s_(1,2,3) for QEP (c) would be estimated in a manner consistent with that of QEP (a). In the example illustrated by FIG. 1, when evaluating the QEP (a), the prior art optimizer used the estimate s^(a)_(1,2,3)=s_(1,2)*s₃ rather than s₁*s₂*s₃, because the former estimate better exploits the available correlation information, i.e., the correlation between p₁ and p₂. In general, there may be many possible choices for the estimate of s^(a)_(1,2,3).

Although an ad hoc method as in the example of FIG. 1 may ensure consistency, it ignores valuable knowledge, e.g., the correlation between p₁ and p₃. Moreover, this method complicates the logic of the optimizer, because cumbersome bookkeeping is required to keep track of how an estimate was derived initially and to ensure that it will always be computed in the same way when estimating other plans. Even worse, ignoring the known correlation between p₁ and p₃ also introduces bias towards certain QEPs: if, as is often the case with correlation, s_(1,3)>>s₁*s₃, and s_(1,2)>>s₁*s₂, and if s_(1,2) and s_(1,3) have comparable values, then the optimizer will be biased towards the QEP (c) plan, even though the QEP (a) plan in FIG. 1 might be cheaper, i.e., the optimizer thinks that the QEP (c) will produce fewer rows during index-ANDing, but this might not actually be the case. In general, an optimizer will often be drawn towards those QEPs about which it knows the least, because use of the independence assumption makes these plans seem cheaper due to underestimation. This problem has been dubbed “fleeing from knowledge to ignorance”.

Another significant problem encountered when estimating selectivities in a real-world database management system is that the given selectivities might not be mutually consistent. For example, the selectivities s₁=0.1 and s_(1,2)=0.15 are inconsistent, because they violate the obvious requirement that s_(X)≧s_(Y) whenever X⊂Y. In the presence of inconsistent statistics, it may be impossible to find a set of satisfiable constraints for estimating all the selectivities in a consistent and comprehensive manner.
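The monotonicity requirement stated above can be checked mechanically. The following is a minimal sketch, assuming the knowledge set is held as a dictionary from frozensets of predicate indices to selectivities; the function name `monotonicity_violations` is hypothetical. The pairwise test is only a necessary condition for consistency: a set of selectivities can pass it and still admit no feasible joint distribution.

```python
from itertools import combinations

def monotonicity_violations(selectivities):
    """Return pairs (X, Y) where X is a proper subset of Y but s_X < s_Y."""
    violations = []
    for x, y in combinations(selectivities, 2):
        for small, large in ((x, y), (y, x)):
            # For frozensets, `<` tests for a proper subset.
            if small < large and selectivities[small] < selectivities[large]:
                violations.append((small, large))
    return violations

# Knowledge set keyed by predicate index set (values from the example above).
stats = {
    frozenset(): 1.0,
    frozenset({1}): 0.1,
    frozenset({1, 2}): 0.15,  # larger than s_1, hence inconsistent
}
print(monotonicity_violations(stats))
```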

There are two typical causes of inconsistent statistics. First, the single-column statistics are often taken from the system catalogue directly or derived by the optimizer from catalogue statistics. Because collection of accurate statistics can be a highly cost-intensive process, commercial database systems typically compute catalogue statistics using approximate methods such as random sampling or probabilistic counting. Even when the catalogue statistics are exact, the selectivity estimates computed by the optimizer from these statistics often incorporate inaccurate uniformity assumptions or use rough histogram approximations based on a small number of known quantiles. A second cause of inconsistent knowledge is the fact that different statistics may be collected at different points in time, and the underlying data can change in between collection epochs. This problem is particularly acute in more recent prior art systems where some of the MVS used by the optimizer might be based on query feedback or materialized statistical views.

SUMMARY OF THE INVENTION

In one embodiment of the present invention, a computer-implemented method is given a set of input selectivities {s_(X): X∈T}, and the method includes adjusting the input selectivities to obtain a mutually consistent set of selectivities; and minimizing the adjusting of the input selectivities.

In another embodiment of the present invention, a database query optimizer, receiving input of a knowledge set T for a predicate index set N and selectivities {s_(Y): Y∈T} for a predicate set P, executes steps for: selecting, in response to N not satisfying a partitioning condition, X∈T such that X has the smallest impact on a maximum entropy (ME) solution to a constrained optimization problem; and removing the element X from T in order to force a partitioning of P and T.

In a further embodiment of the present invention, a computer program product comprising a computer useable medium includes a computer readable program, wherein the computer readable program when executed on a computer causes the computer to: input a set of input selectivities {s_(X): X∈T}; form the |T| constraints Σ_(b∈C(X)) x_(b)=s_(X), X∈T; and detect a zero atom b for which x_(b)=0 in the constraints.

These and other features, aspects and advantages of the present invention will become better understood with reference to the following drawings, description and claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates exemplary QEPs (a), (b) and (c) and associated prior art selectivity estimates;

FIG. 2 is a block diagram for a method of preprocessing and estimating selectivities according to an embodiment of the present invention;

FIG. 3 is a Venn diagram illustrating an exemplary probability space given some known conjunctive predicate estimates according to an embodiment of the present invention;

FIG. 4 is a Venn diagram illustrating a maximum entropy solution for the exemplary probability space shown in FIG. 3; and

FIG. 5 is a pseudo-code algorithmic specification for forced partitioning for computation of selectivities according to an embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

The following detailed description is of the best currently contemplated modes of carrying out the invention. The description is not to be taken in a limiting sense, but is made merely for the purpose of illustrating the general principles of the invention, since the scope of the invention is best defined by the appended claims.

Broadly, embodiments of the present invention provide a novel maximum entropy (ME) method for estimating the selectivity of conjunctive predicates, utilizing an approach that is information-theoretically sound, i.e., valid from the point of view of information theory, as known in the art, and that takes into account available statistics on both single columns and groups of columns. Embodiments of the invention find application, for example, in commercial query optimizers that use statistical information on the number of rows in a relational database management system table and the number of distinct values in a column to compute the selectivities of simple predicates.

Unlike prior art ad hoc query optimizers—such as the example given above—that select from available knowledge and thus remain subject to inconsistencies and to introducing biases, embodiments avoid arbitrary biases, inconsistencies, and the flight from knowledge to ignorance by deriving missing knowledge using the ME principle. Embodiments exploit any and all available multi-attribute information, in contrast to prior art selectivity models that typically ignore at least some available information.

Embodiments also differ from the prior art by introducing methods for resolving inconsistencies in the available multivariate statistics (MVS) that would otherwise prevent computation of a ME solution; such inconsistencies can arise when the single-column statistics in the catalogue have been computed only approximately, or when the various statistics used by the optimizer have been computed at different points in time.

Other embodiments differ from the prior art by including novel partitioning methods that permit application of the inventive method of selectivity estimation (and also prior art methods of selectivity estimation) to complex queries with many predicates. For example, in various embodiments, the efficiency of the ME computation can be improved—often by orders of magnitude—by partitioning the predicates into disjoint sets and computing an ME distribution for each of the resulting sub-problems. Thus, estimating selectivities for a complex query, e.g., one having more than 10 predicates, may be feasible with the present invention where not practical using prior art methods.

Further embodiments differ from the prior art by providing methods for detecting and eliminating atoms of the predicates with implied selectivity of zero (also referred to as “zero atoms” or “zeroes”) prior to the computation of the selectivities of the predicates. Implied zero elimination may ensure convergence of an iterative scaling algorithm used in some embodiments to compute selectivities using the ME model. As with the novel methods for resolving inconsistencies and for partitioning, the novel method for implied zero elimination may also be applied to prior art methods for estimating selectivities.

The problem of selectivity estimation for conjunctive predicates, given partial MVS, may be formalized using the following terminology. A set of boolean factors (BFs) may be denoted as P={p₁, . . . , p_(n)}. For any X⊂N={1, . . . , n}, p_(X) is used to denote the conjunctive predicate ∧_(i∈X) p_(i). The symbol s denotes a probability measure over 2^(N), the powerset of N, with the interpretation that s_(X) is the selectivity of the predicate p_(X). (The quantity s_(X) can also be interpreted as the probability that a randomly selected row satisfies p_(X).) Usually, for |X|=1, the histograms and column statistics from the system catalog determine s_(X) and are all known. For |X|>1, the MVS may be stored in the database system catalog either as multidimensional histograms, index statistics, or some other form of column-group statistics or statistics on intermediate tables. In practice, s_(X) is not known for all possible predicate combinations due to the exponential number of combinations of columns that can be used to define MVS. Suppose that s_(X) is known for every X in some collection T⊂2^(N). The collection T may be referred to as the “knowledge set” for the predicates P. It may be defined that the empty set Ø is always part of T, because s_(Ø)=1 when applying no predicates. Given that s_(X) is known for every X in the collection T, the selectivity estimation problem is to compute s_(X) for X∈2^(N)\T, i.e., to compute s_(X) for all the remaining sets X not in the collection T.

Information theory defines for a probability distribution q=(q₁, q₂, . . . ) a measure of uncertainty called entropy: H(q)=−Σ_(i) q_(i) log q_(i). The ME principle prescribes selection of the unique probability distribution that maximizes the entropy function H(q) and is consistent with respect to the known information. Entropy maximization without any additional information uses the single constraint that the sum of all probabilities equals 1. The ME probability distribution is then the uniform distribution. When constraints only involve marginals of a multivariate distribution, the ME solution coincides with the independence assumption. Thus, query optimizers that do not use MVS represent a special case of estimating selectivities for conjunctive queries according to the ME principle: they assume uniformity when no information about column distributions is available, and they assume independence because they do not know about any correlations. In contrast, integrating the more general ME principle into the inventive optimizer's cardinality model generalizes these concepts of uniformity and independence. The ME principle enables the inventive optimizer to take advantage of all available information in a consistent way, avoiding inappropriate bias towards any given set of selectivity estimates. The ME principle applied to selectivity estimation means that, given several selectivities of simple predicates and conjuncts, the inventive optimizer chooses the most uniform and independent selectivity model consistent with this information.
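The two special cases named above (uniformity and independence) can be illustrated numerically. The following is a minimal sketch; the helper names `entropy` and `joint` and all probability values are assumptions chosen for illustration.

```python
import math

def entropy(q):
    """Shannon entropy H(q) = -sum q_i log q_i, taking 0 log 0 as 0."""
    return -sum(p * math.log(p) for p in q if p > 0)

# With only the sum-to-one constraint, the uniform distribution has the
# largest entropy among distributions over the same number of outcomes.
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]

# Two predicates with fixed marginals s1 and s2 (assumed values).
s1, s2 = 0.1, 0.2

def joint(s12):
    """Atom probabilities (both, only p1, only p2, neither) for a given s12."""
    return [s12, s1 - s12, s2 - s12, 1 - s1 - s2 + s12]

independent = joint(s1 * s2)  # s12 chosen by the independence assumption
correlated = joint(0.05)      # same marginals, but positively correlated

print(entropy(uniform), entropy(skewed))
print(entropy(independent), entropy(correlated))
```

Among all joint distributions with the same two marginals, the independent one has the highest entropy, which is why an optimizer without MVS behaves as a special case of the ME principle.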

Referring now to FIG. 2, method 200 for estimating selectivities for a query may begin with an input 202 of a query with predicates indexed by the set N, for example, P={p₁, . . . , p_(n)} and N={1, . . . , n}. N may be called the index set of P. A brief description of method 200 is given here, with more detailed descriptions of partitioning (206), resolving inconsistencies (210), and implied zero detection and elimination (212) steps of method 200 given further below.

At step 204, method 200 may retrieve from the database the statistics for T, comprising single-column statistics, e.g., the individual selectivities s₁, s₂, . . . , s_(n), and multivariate statistics, e.g., joint selectivities such as s_(1,2), s_(3,5), and s_(2,3,4), where T is a collection T⊂2^(N), as formalized above.

At step 206, if there is a need to reduce computational complexity of estimating selectivities, the overall problem of estimating selectivities may be divided into several sub-problems 207 as indicated by the branching of flow arrows at 208. To accomplish a division of the overall problem into subproblems, method 200 may partition N with respect to T. The overall problem may then be solved by combining solutions for the sub-problems 207 as indicated by the rejoining of flow arrows at 209. If T satisfies a certain condition (described below), method 200 may accordingly partition N with respect to T (“naturally”) at step 206 and, if not, method 200 may use forced partitioning to partition N with respect to T at step 206. Both types of partitioning, which may occur at step 206, are described in more detail below. Each of the sub-problems 207 corresponds to a component N_(k) of the partition of N. (If N is not partitioned, there is only one component N₁=N and one sub-problem 207.)

At step 210, method 200 may resolve inconsistencies for each of the sub-problems 207 as described in more detail below.

At step 212, method 200 may detect and eliminate zero atoms as described in more detail below.

At step 214, method 200 may compute an ME solution of selectivity estimates for each component of the partition of N with respect to T.

At step 216, the solutions for the partitions may be combined, as described in more detail below in connection with partitioning.

At output 218, an overall solution for the problem of estimating selectivities may be given in accordance with the ME selectivity model.

In order to describe in more detail the partitioning (206), resolving inconsistencies (210), and implied zero detection and elimination (212) steps of method 200, the following begins with a description of the constrained optimization problem that may be solved at step 214 in order for method 200 to provide selectivity estimation at step 214. The constrained optimization problem is described first in order to provide a consistent terminology for describing the partitioning (206), resolving inconsistencies (210), and implied zero detection and elimination (212) steps.

Rather than applying the steps of resolving inconsistencies (210) and detecting zero atoms (212) directly to a selectivity estimation problem it may be more computationally efficient to first break up the selectivity estimation problem into smaller problems before applying the steps of resolving inconsistencies (210) and detecting zero atoms (212) individually to each smaller problem. Thus, method 200 may first perform the partitioning (206) process which accomplishes this breaking up of a problem into smaller sub-problems.

Likewise, because a newly consistent set of constraints may have new zero atoms, method 200 may first obtain a consistent starting set of constraints by applying the process of resolving inconsistencies (210) to the (sub)problem before detecting and removing zero atoms (212). Also, because the steps of resolving inconsistencies (210) and detecting zero atoms (212) may facilitate a more efficient computation of solutions to the constrained optimization (sub)problems for selectivity estimation, method 200 may first perform all the preprocessing steps of partitioning (206), resolving inconsistencies (210), and detecting zero atoms (212) before computing an ME solution for the constrained optimization problem. Thus, while the sections below are presented in a certain order to facilitate a logical and consistent exposition, that order of presentation of the processing steps of method 200 may be the reverse of the exemplary order of execution in method 200. Thus, each heading below is cross-referenced to a corresponding step of method 200.

The Constrained Optimization Problem (Step 214)

Given a set of predicates P={p₁, p₂, . . . , p_(n)}, each of the corresponding atoms (terms in disjunctive normal form (DNF), i.e., terms of the form ∧_(i∈N) p_(i)^(b_(i)) for b_(i)∈{0,1}, where p_(i)^(1) stands for p_(i) and p_(i)^(0) for ¬p_(i)) may be denoted in abbreviated form by a binary string of length n. For example, when n=3, the string b=100 denotes the atom p₁ ∧ ¬p₂ ∧ ¬p₃, and so forth. As above, N={1,2, . . . ,n} and 2^(N) is used to denote the set of all subsets of N. For each predicate p_(X), X∈2^(N), C(X) is used to denote the set of components of X, i.e., the set of all atoms contributing to p_(X). Formally, C(X)={b∈{0,1}^(n) | b_(i)=1 for all i∈X} and C(Ø)={0,1}^(n). For example, for predicates p₁ and p_(1,2), the components are: C({1})={100, 110, 101, 111} and C({1,2})={110, 111}. Additionally, for each possible knowledge set T⊂2^(N), P(b,T) denotes the set of all X∈T such that p_(X) has b as an atom in its DNF representation, i.e., P(b,T)={X∈T | b_(i)=1 for all i∈X}∪{Ø}. Thus, for the atom 011 and T=2^({1,2,3}), for example, P(b,T)={{2}, {3}, {2,3}, Ø}.
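The definitions of C(X) and P(b,T) translate directly into set comprehensions. The following is a minimal sketch, assuming atoms are represented as bit strings and index sets as frozensets; the function names `atoms`, `components`, and `knowledge_of_atom` are hypothetical.

```python
from itertools import combinations, product

def atoms(n):
    """All atoms for n predicates, as bit strings such as '100'."""
    return ["".join(bits) for bits in product("01", repeat=n)]

def components(x, n):
    """C(X): atoms b with b_i = 1 for every i in X (indices are 1-based)."""
    return {b for b in atoms(n) if all(b[i - 1] == "1" for i in x)}

def knowledge_of_atom(b, t):
    """P(b, T): every X in T whose predicate p_X contains atom b, plus the empty set."""
    return {x for x in t if all(b[i - 1] == "1" for i in x)} | {frozenset()}

n = 3
# The full power set 2^{1,2,3}, used as the knowledge set T in the text's example.
power_set = {
    frozenset(c)
    for k in range(n + 1)
    for c in combinations(range(1, n + 1), k)
}

print(sorted(components({1}, n)))
print(sorted(components({1, 2}, n)))
print(knowledge_of_atom("011", power_set))
```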

Given s_(X) for XεT, s_(X) for X∉T may be computed according to the ME principle. To this end, method 200, for example, at step 214, may solve the following constrained optimization problem:

$$\underset{x_b,\; b \in \{0,1\}^n}{\text{minimize}} \;\sum_{b \in \{0,1\}^n} x_b \log x_b \qquad \text{(Equation 1)}$$

subject to the |T| constraints

$$\sum_{b \in C(X)} x_b = s_X, \quad X \in T, \qquad \text{(Equation 2)}$$

where x_(b)∈[0,1] denotes the selectivity of atom b. The |T| constraints correspond to the known selectivities. One of the included constraints is s_(Ø)=Σ_(b∈{0,1}^(n)) x_(b)=1, which asserts that the combined selectivity of all atoms is 1. The solution is a probability distribution with the maximum value of uncertainty (entropy), subject to the constraints. Given this solution, method 200 can compute any arbitrary selectivity s_(X) as s_(X)=Σ_(b∈C(X)) x_(b). The above problem can be solved analytically only in simple cases with a small number of unknowns. In general, a numerical solution method, such as an iterative scaling computation or a Newton-Raphson method, is required to solve the constrained optimization problem.

EXAMPLE

FIG. 3 shows the probability space for the case N={1,2,3}, T={{1}, {2}, {3}, {1,2}, {1,3}, Ø}, and selectivities s₁=0.1, s₂=0.2, s₃=0.25, s_(1,2)=0.05, s_(1,3)=0.03, and s_(Ø)=1.

The example of FIG. 3 implies the following six constraints:

s₁ = x₁₀₀ + x₁₁₀ + x₁₀₁ + x₁₁₁ = 0.1  (I)
s₂ = x₀₁₀ + x₀₁₁ + x₁₁₀ + x₁₁₁ = 0.2  (II)
s₃ = x₀₀₁ + x₀₁₁ + x₁₀₁ + x₁₁₁ = 0.25  (III)
s_(1,2) = x₁₁₀ + x₁₁₁ = 0.05  (IV)
s_(1,3) = x₁₀₁ + x₁₁₁ = 0.03  (V)
s_(Ø) = Σ_(b∈{0,1}³) x_(b) = 1  (VI)

The task of selectivity estimation is now to compute a solution for all atoms x_(b), b∈{0,1}³, that maximizes the entropy function −Σ_(b∈{0,1}³) x_(b) log x_(b) and satisfies the above six constraints. Any desired selectivity s_(X) can then be computed from the x_(b) values as indicated previously.

FIG. 4 gives the results obtained when solving this constrained optimization problem. For instance, in this ME solution, method 200 may obtain the selectivity estimate s_(1,2,3)=x₁₁₁=0.015 and s_(2,3)=x₁₁₁+x₀₁₁=0.05167.
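The values quoted for FIG. 4 can be reproduced with a simple iterative scaling loop. The following is a minimal sketch under stated assumptions, not the specific computation used by method 200: it starts from the uniform distribution and repeatedly rescales the atoms inside and outside C(X) so that each known selectivity is met in turn. The function names `max_entropy_atoms` and `selectivity` and the sweep count are hypothetical choices.

```python
from itertools import product

def max_entropy_atoms(n, known, sweeps=2000):
    """Iterative scaling from the uniform distribution over all 2^n atoms.

    `known` maps a tuple of 1-based predicate indices X to its selectivity s_X.
    Assumes the inputs are consistent and every atom is strictly positive.
    """
    all_atoms = ["".join(bits) for bits in product("01", repeat=n)]
    x = {b: 1.0 / len(all_atoms) for b in all_atoms}
    for _ in range(sweeps):
        for subset, s in known.items():
            inside = {b for b in all_atoms if all(b[i - 1] == "1" for i in subset)}
            mass = sum(x[b] for b in inside)
            for b in all_atoms:
                # Rescale so atoms in C(X) sum to s and the rest sum to 1 - s.
                x[b] *= s / mass if b in inside else (1.0 - s) / (1.0 - mass)
    return x

def selectivity(x, subset):
    """s_X as the sum of x_b over the atoms b in C(X)."""
    return sum(v for b, v in x.items() if all(b[i - 1] == "1" for i in subset))

# Known selectivities from the example of FIG. 3 (s_empty = 1 is implicit).
known = {(1,): 0.1, (2,): 0.2, (3,): 0.25, (1, 2): 0.05, (1, 3): 0.03}
x = max_entropy_atoms(3, known)
print(round(selectivity(x, (1, 2, 3)), 5), round(selectivity(x, (2, 3)), 5))
```

Because the rescaling keeps the total mass at 1, the constraint for s_(Ø) never needs to be applied explicitly in this sketch.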

Detection of Implied Zero Atoms (Step 212)

It is often the case that the constraints in Equation 2 imply that the selectivity of certain atoms must be zero in any feasible solution. For instance, if p₁ ⇒ p₂, so that s₁=s_(1,2), then x₁₀₀=x₁₀₁=0 in any solution x. These zero atoms, i.e., atoms for which the selectivity=0, can destabilize numerical solution algorithms so that they do not converge to a solution of the ME optimization problem. Consequently, all zero atoms must be identified and explicitly removed from the constraints in the ME optimization problem prior to execution of numerical solution algorithms.
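The implication example above corresponds to a simple rule: whenever X⊂Y and s_(X)=s_(Y), every atom in C(X) but not in C(Y) must be zero. The following is a minimal sketch of that rule only. It is a shortcut that can miss zero atoms requiring longer chains of reasoning, and it is not either of the detection sub-methods described in this document. The function names and the value 0.1 are assumptions for illustration.

```python
from itertools import product

def components(x, n):
    """C(X): atoms (bit strings) with a 1 at every 1-based position in X."""
    return {
        "".join(bits)
        for bits in product("01", repeat=n)
        if all(bits[i - 1] == "1" for i in x)
    }

def zeros_from_equalities(n, known, tol=1e-12):
    """Atoms forced to zero because s_X == s_Y for some X that is a proper subset of Y."""
    zero_atoms = set()
    for x, sx in known.items():
        for y, sy in known.items():
            if x < y and abs(sx - sy) <= tol:
                # The atoms in C(X) \ C(Y) sum to s_X - s_Y = 0.
                zero_atoms |= components(x, n) - components(y, n)
    return zero_atoms

# p1 implies p2, so s_1 equals s_(1,2); the value 0.1 is illustrative.
known = {frozenset({1}): 0.1, frozenset({1, 2}): 0.1}
print(sorted(zeros_from_equalities(3, known)))
```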

Identifying zero atoms is nontrivial in general, because the reasoning involved can be arbitrarily complex. In the following sections, an iterative sub-method 212 a and an approximation sub-method 212 b are described for automatically detecting zero atoms, either of which may be used by method 200, for example, at step 212. The iterative sub-method 212 a provides an exact solution but is computationally expensive while the approximation sub-method 212 b is relatively quick (computationally inexpensive) but approximate.

Iterative Detection of Zero Atoms (Sub-method 212 a)

The iterative sub-method 212 a for detecting zero atoms is based on the following. If, for a given atom b, there exists a feasible solution x that satisfies all constraints in Equation 2 and in which x_(b)>0, then x_(b) is also positive in the ME solution.

The iterative sub-method 212 a for zero detection may begin with an initial set A₀ of candidate zero atoms that contains all of the atoms: A₀={0,1}^(n). In the first iteration the iterative sub-method 212 a may set i=0 and solve the linear program (LP):

$$\underset{x_b,\; b \in \{0,1\}^n}{\text{maximize}} \;\sum_{b \in A_i} x_b \quad \text{subject to Equation 2 and to } x_b \in [0,1],\; b \in \{0,1\}^n \qquad \text{(Equation 3)}$$

using, e.g., the simplex method (as known in the art). The idea is that a solution to the above problem can make each x_(b) as large as possible while satisfying the feasibility conditions. Any x_(b) that is equal to 0 in the LP solution is therefore likely to be a zero atom. However, an x_(b) that is equal to 0 in the LP solution is not guaranteed to be a zero atom; it could be the case that x_(b)=0 in the computed optimal LP solution but not in other possible optimal LP solutions. The iterative sub-method 212 a may therefore refine the set of candidates as A₁=A₀\{b | x_(b)≠0}, set i=1, and solve the resulting LP in Equation 3. The iterative sub-method 212 a may proceed in this manner, iterating until either (i) A_(i)=Ø or (ii) x_(b)=0 for every b∈A_(i). In the former case, each candidate atom has been shown to have positive probability in at least one feasible solution, and hence, as discussed previously, is positive in the ME solution; the iterative sub-method 212 a may therefore conclude that there are no zero atoms. In the latter case, the objective-function value for the solution to the LP in Equation 3 is 0, and therefore each atom b∈A_(i) has x_(b)=0 in any feasible solution (otherwise, the optimal objective-function value would have been positive), and hence each atom b∈A_(i) also has x_(b)=0 in the ME solution. Therefore, the iterative sub-method 212 a may conclude that A_(i) is precisely the set of zero atoms.

Example: Iterative Zero Detection (Sub-method 212 a)

Suppose that N={1,2,3}, T={{1}, {2}, {3}, {1,2}, {1,3}, {2,3}, Ø}, s₁=0.23, s₂=0.01, s₃=0.015, s_(1,2)=0.01, s_(1,3)=0.01, s_(2,3)=0.01, and s_(Ø)=1. The constraints in the ME optimization problem are:

s₁ = x₁₀₀ + x₁₁₀ + x₁₀₁ + x₁₁₁ = 0.23
s₂ = x₀₁₀ + x₀₁₁ + x₁₁₀ + x₁₁₁ = 0.01
s₃ = x₀₀₁ + x₀₁₁ + x₁₀₁ + x₁₁₁ = 0.015
s_(1,2) = x₁₁₀ + x₁₁₁ = 0.01
s_(1,3) = x₁₀₁ + x₁₁₁ = 0.01
s_(2,3) = x₀₁₁ + x₁₁₁ = 0.01
s_(Ø) = Σ_(b∈{0,1}³) x_(b) = 1

Setting A₀={0,1}³ and solving the LP in Equation 3, the iterative sub-method 212 a obtains a solution in which x₀₀₀, x₁₀₀, x₀₀₁, and x₁₁₁ are nonzero. For the next iteration the iterative sub-method 212 a therefore sets A₁={010, 110, 101, 011}. Solving the resulting LP, the iterative sub-method 212 a finds that x₀₁₀=x₁₁₀=x₁₀₁=x₀₁₁=0, so that termination condition (ii) holds and A₁ is the set of zero atoms.
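For this small example the result can also be confirmed by hand. The following worked check is added for illustration only and is not part of sub-method 212 a; it uses the constraints listed above together with the requirement that every x_(b) be nonnegative:

$$\begin{aligned}
s_2 - s_{1,2} &= x_{010} + x_{011} = 0 &&\Rightarrow\; x_{010} = x_{011} = 0,\\
s_{2,3} &= x_{011} + x_{111} = 0.01 &&\Rightarrow\; x_{111} = 0.01,\\
s_{1,2} &= x_{110} + x_{111} = 0.01 &&\Rightarrow\; x_{110} = 0,\\
s_{1,3} &= x_{101} + x_{111} = 0.01 &&\Rightarrow\; x_{101} = 0,\\
s_3 &= x_{001} + x_{011} + x_{101} + x_{111} = 0.015 &&\Rightarrow\; x_{001} = 0.005,\\
s_1 &= x_{100} + x_{110} + x_{101} + x_{111} = 0.23 &&\Rightarrow\; x_{100} = 0.22,\\
s_{\emptyset} &= \textstyle\sum_b x_b = 1 &&\Rightarrow\; x_{000} = 1 - 0.22 - 0.005 - 0.01 = 0.765.
\end{aligned}$$

The constraints therefore admit exactly one feasible solution, in which the atoms 010, 011, 110, and 101 are zero and the atoms 000, 100, 001, and 111 are positive.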

The foregoing iterative sub-method 212 a can be shown to discover every zero atom and not misclassify any nonzero atoms as zero atoms. Unfortunately, due to its iterative nature, the sub-method 212 a may be so computationally expensive as to be impractical in real-world applications, even with a highly sophisticated LP solver.

Zero Detection Via Approximation (Sub-method 212 b)

A more practical implementation of method 200 may employ an approximation detection sub-method 212 b, for example, at step 212 of method 200, that may offer a reasonable trade-off between accuracy and execution time. The approximation detection sub-method 212 b may first rewrite the selectivity of each atom as the sum of two new variables: x_(b)=v_(b)+w_(b). The approximation detection sub-method 212 b may now solve the following LP:

$$\underset{v_b,\, w_b}{\text{maximize}} \;\sum_{b \in \{0,1\}^n} v_b$$

subject to

$$\sum_{b \in C(X)} (v_b + w_b) = s_X, \quad X \in T,$$

$$0 \le w_b \le 1 \ \text{ and } \ 0 \le v_b \le \varepsilon, \quad b \in \{0,1\}^n,$$

where ε is a small value. For example, ε=0.0001 may be selected, and the value chosen may depend on various factors such as execution time for solving the LP and the level of precision desired. After solving this LP, an atom b may be considered to be a zero atom if and only if w_(b)=v_(b)=0. The idea is that setting x_(b)=0 requires setting v_(b)=0, which can significantly decrease the objective-function value because of the ε upper bound on all of the selectivities. Thus only “true” zero atoms are likely to be identified by the solution to the LP. The approximation detection sub-method 212 b may include the w_(b) variables because they provide the “padding” needed to ensure that the original constraints in Equation 2 are satisfied. The approximation detection sub-method 212 b can find all of the zero atoms, but may be considered approximate in that it can mislabel some nonzero atoms as zero atoms. In practice, such mislabelings tend to be infrequent, so that the quality of the ultimate ME solution (e.g., given by method 200) remains good.

Example: Zero Detection Via Approximation (Sub-method 212 b)

As in the previous example for sub-method 212 a, again suppose that N={1,2,3}, T={{1}, {2}, {3}, {1,2}, {1,3}, {2,3}, Ø}, s₁=0.23, s₂=0.01, s₃=0.015, s_(1,2)=0.01, s_(1,3)=0.01, s_(2,3)=0.01, and s_(Ø)=1. The constraints in the resulting LP are:

$\begin{aligned} s_{1} &= v_{100} + w_{100} + v_{110} + w_{110} + v_{101} + w_{101} + v_{111} + w_{111} = 0.23 \\ s_{2} &= v_{010} + w_{010} + v_{011} + w_{011} + v_{110} + w_{110} + v_{111} + w_{111} = 0.01 \\ s_{3} &= v_{001} + w_{001} + v_{011} + w_{011} + v_{101} + w_{101} + v_{111} + w_{111} = 0.015 \\ s_{1,2} &= v_{110} + w_{110} + v_{111} + w_{111} = 0.01 \\ s_{1,3} &= v_{101} + w_{101} + v_{111} + w_{111} = 0.01 \\ s_{2,3} &= v_{011} + w_{011} + v_{111} + w_{111} = 0.01 \\ s_{\varnothing} &= \sum_{b \in \{0,1\}^{3}} (v_{b} + w_{b}) = 1 \end{aligned}$

In the solution returned by the simplex algorithm, the following variables are equal to zero: v₀₁₀, w₀₁₀, v₁₁₀, w₁₁₀, v₁₀₁, w₁₀₁, v₀₁₁, w₀₁₁. The approximation detection sub-method 212 b therefore can take the set of zero atoms as {010, 110, 101, 011}. Observe that this solution for this example coincides with the solution for the previous example returned by the iterative detection sub-method 212 a.
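By way of illustration only, the following Python sketch checks the reported zero atoms against the seven constraints of this example. It does not solve the LP. The nonzero atom values (0.765, 0.22, 0.005, 0.01) are not stated in the text; they are an assumption worked out by hand from the constraints, and the names `in_C`, `constraints_hold`, and `zero_atoms` are hypothetical. Exact fractions are used so that the equality checks are not disturbed by floating-point rounding.

```python
from fractions import Fraction as F
from itertools import product

def in_C(atom, X):
    """True when atom b belongs to C(X): b has a 1 in every position indexed by X."""
    return all(atom[i - 1] == "1" for i in X)

# Input selectivities from the example; the empty tuple stands for the empty set.
s = {
    (1,): F("0.23"), (2,): F("0.01"), (3,): F("0.015"),
    (1, 2): F("0.01"), (1, 3): F("0.01"), (2, 3): F("0.01"),
    (): F(1),
}

# Candidate atom selectivities x_b = v_b + w_b.
# Assumption: these four nonzero values were derived by hand, not taken from the text.
x = {"".join(bits): F(0) for bits in product("01", repeat=3)}
x.update({"000": F("0.765"), "100": F("0.22"), "001": F("0.005"), "111": F("0.01")})

def constraints_hold(x, s):
    """Check that the atoms in C(X) sum to s_X for every X in T."""
    return all(
        sum(v for b, v in x.items() if in_C(b, X)) == s_X
        for X, s_X in s.items()
    )

zero_atoms = {b for b, v in x.items() if v == 0}
print(constraints_hold(x, s), sorted(zero_atoms))
```

In this example the constraints s₂=s_(1,2)=s_(2,3)=0.01 leave no room for the atoms 010 and 011, which in turn forces 110 and 101 to zero, so the candidate above is consistent with the set of zero atoms reported for sub-methods 212 a and 212 b.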

Resolving Inconsistencies (Step 210)

A significant problem encountered when applying ME selectivity estimation method 200 in a real-world database management system is that the given selectivities {s_(X): X∈T} might not be mutually consistent. For example, the selectivities s₁=0.1 and s_(1,2)=0.15 are inconsistent, because they violate the obvious requirement that s_(X)≥s_(Y) whenever X⊂Y. In the presence of inconsistent statistics, there cannot exist any solution to the constrained optimization problem in Equations 1 and 2, much less an ME solution, and the iterative scaling algorithm, for example, if applied blindly, will fail to converge. Therefore, method 200 provides at step 210 a method for resolving inconsistencies that may adjust the input selectivities (obtained, e.g., at step 204) to obtain a set of satisfiable constraints prior to solution of the constrained optimization problem (e.g., at step 214) by execution, for example, of the iterative-scaling method.

The method for resolving inconsistencies first may associate two “slack” variables a_(X)⁺ and a_(X)⁻ with each of the original constraints in Equation 2, except for the constraint corresponding to s_(Ø). This latter constraint ensures that the atom selectivities sum to 1, and therefore may not be modified. The method for resolving inconsistencies (step 210) then may solve the following LP:

$\begin{aligned} \underset{a_{X}^{+},\,a_{X}^{-},\,x_{b}}{\text{minimize}} \quad & \sum_{X \in T \setminus \{\varnothing\}} \left( a_{X}^{+} + a_{X}^{-} \right) \\ \text{subject to} \quad & \sum_{b \in C(X)} x_{b} + a_{X}^{+} - a_{X}^{-} = s_{X}, \quad X \in T \setminus \{\varnothing\}, \\ & \sum_{b \in \{0,1\}^{n}} x_{b} = 1, \\ & a_{X}^{+} \geq 0, \quad a_{X}^{-} \geq 0, \quad \text{and} \quad 0 \leq s_{X} - a_{X}^{+} + a_{X}^{-} \leq 1 \end{aligned} \qquad (\text{Equation 4})$

The slack variables may represent either positive or negative adjustments to the selectivities needed to ensure the existence of a feasible solution. In this connection, it may be observed that, in the optimal solution to the LP, at most one of the two slack variables for a constraint can be nonzero; indeed, for a specified value a_(X)⁺−a_(X)⁻ of the total adjustment, any solution that has a nonzero value for both slack variables yields a higher value of the objective function than a solution with only a single nonzero slack variable. The presence of a nonzero slack variable can both signal the presence of an inconsistency and indicate how to obtain consistency, namely, by setting s_(X)*:=s_(X)−a_(X)⁺+a_(X)⁻. The final constraint in Equation 4 ensures that the adjusted selectivities lie in the range [0,1]. By taking the objective function as the sum of the slack variables, the method for resolving inconsistencies (step 210) can ensure that the adjustments to the constraints are as small as possible.

In an alternative embodiment of step 210, the terms in the objective function may be weighted to reflect the fact that, typically, some statistics are more reliable than others. A large weighting coefficient for the slack variables a_(X) ⁺ and a_(X) ⁻ could be used for a corresponding selectivity s_(X) that is relatively more reliable; thus, this selectivity is relatively unlikely to be adjusted because an adjustment would incur a relatively large penalty in the objective function. By means of this weighting method, unreliable statistics are more likely to be subject to adjustments than reliable ones.

A newly consistent set of constraints may have zero atoms. Method 200, thus, first may obtain a consistent starting set of constraints at step 210 using the methods described here and then detect and remove zero atoms at step 212 as described above under the headings relating to “ZERO DETECTION”.

Example: Inconsistency Detection and Removal (Step 210)

Suppose that N={1,2}, T={{1}, {2}, {1,2}, Ø}, s₁=0.99, s₂=0.99, s_(1,2)=0.90, and s_(Ø)=1. The constraints for the LP are given by:

$\begin{aligned} s_{1} &= x_{10} + x_{11} + a_{1}^{+} - a_{1}^{-} = 0.99 \\ s_{2} &= x_{01} + x_{11} + a_{2}^{+} - a_{2}^{-} = 0.99 \\ s_{1,2} &= x_{11} + a_{1,2}^{+} - a_{1,2}^{-} = 0.90 \\ s_{\varnothing} &= x_{00} + x_{10} + x_{01} + x_{11} = 1 \end{aligned}$

Minimizing the sum over all slack variables yields the following solution: a₁⁺=a₁⁻=0, a₂⁺=a₂⁻=0, a_(1,2)⁺=0, and a_(1,2)⁻=0.08.

Although it may not be readily apparent, the constraint set T contains an inconsistency, as signaled by the nonzero slack variable a_(1,2)⁻=0.08. Applying the resulting adjustment s_(1,2)*=s_(1,2)−a_(1,2)⁺+a_(1,2)⁻, the method for resolving inconsistencies (step 210) may obtain the set of consistent selectivities s₁*=0.99, s₂*=0.99, and s_(1,2)*=0.98.
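For two predicates, the four equality constraints determine the four atoms uniquely, so the inconsistency and its repair can be checked by direct arithmetic. The following Python sketch is illustrative only and does not solve the LP of Equation 4; the function names `atoms_two_predicates` and `is_consistent` are hypothetical. It shows that the original selectivities would require a negative atom x₀₀=−0.08, and that raising s_(1,2) by 0.08 is just enough to bring x₀₀ to zero.

```python
from fractions import Fraction as F

def atoms_two_predicates(s1, s2, s12):
    """Solve the four equality constraints (no slack) for the atoms x11, x10, x01, x00."""
    return {
        "11": s12,
        "10": s1 - s12,
        "01": s2 - s12,
        "00": 1 - s1 - s2 + s12,
    }

def is_consistent(s1, s2, s12):
    """The selectivities are consistent exactly when every atom is non-negative."""
    return all(v >= 0 for v in atoms_two_predicates(s1, s2, s12).values())

original = (F("0.99"), F("0.99"), F("0.90"))
adjusted = (F("0.99"), F("0.99"), F("0.98"))  # s_(1,2)* = 0.90 + 0.08

print(atoms_two_predicates(*original)["00"])  # prints -2/25, i.e. -0.08
print(is_consistent(*original), is_consistent(*adjusted))
```

It may be noted that, in this example, other adjustments with the same total slack of 0.08 also restore feasibility (for instance, lowering s₁ to 0.91), so the particular solution returned may depend on the LP solver.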

Partitioning (Step 206)

Because solving the constrained optimization problem (e.g., at step 214), for example, using iterative scaling, has computational complexity that is exponential in the size of the predicate set P, e.g., complexity of O(|T|2^(|P|)), it may be desirable to avoid executing the iterative scaling algorithm on the full predicate set P. Instead, method 200 may compute the ME solution by partitioning P into several disjoint subsets, executing the iterative scaling computation on each subset independently, and using the independence assumption to combine selectivity estimates for predicates in different partitions. Partitioning can reduce the computational complexity from O(|T|2^(|P|)) to O(|T₁|2^(|P₁|)+|T₂|2^(|P₂|)+ . . . +|T_(k)|2^(|P_(k)|)), where P₁, P₂, . . . , P_(k) form a partition of the predicate set P and T₁, T₂, . . . , T_(k) are the corresponding knowledge sets, and can make the iterative scaling algorithm feasible even for extremely complex queries with large sets of predicates.

Method 200 can naturally partition the predicate set P under certain conditions, e.g. the partitioning condition described below; otherwise, method 200 may force a partition of the predicate set P or may force a further partition of the natural partition if there is a need to make the partition sizes smaller.

If, at step 206, method 200 can split N={1,2, . . . ,n} into nonempty disjoint subsets N₁, N₂, . . . , N_(k), such that for each X∈T it is the case that X⊂N_(i) for some i∈{1,2, . . . ,k}, then the index set N of P is said to satisfy a partitioning condition. In that case, method 200 may partition P and T accordingly (or naturally partition P and T) by setting P_(i)={p_(j)|j∈N_(i)} and T_(i)={X∈T|X⊂N_(i)} for 1≤i≤k. (It may be observed that the T_(i) sets are not completely disjoint, because they all contain Ø.) In method 200, the ME solution for (P,T) can be obtained by first (e.g., using iterative scaling) computing the ME solutions for (P₁,T₁), (P₂,T₂), . . . , (P_(k),T_(k)) and then using the independence assumption to combine selectivity estimates for predicates in different partitions; i.e., method 200 may compute s_(X)=Π_(i∈{1,2, . . . ,k}) s_(X∩N_(i)) for X∈2^(N).

For example, when N={1,2,3,4} and T={{1}, {1,2}, {3}, {3,4}, Ø}, method 200 can take N₁={1,2}, N₂={3,4}, T₁={{1}, {1,2}, Ø}, and T₂={{3}, {3,4}, Ø}. Method 200 can then, for example, compute s₂ by obtaining the ME solution for (P₁,T₁), compute s₄ by obtaining the ME solution for (P₂,T₂), and then compute s_(2,4) using the independence assumption as s_(2,4)=s₂*s₄.
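One possible way to find a natural partition, sketched below in Python, is to treat each X∈T as linking its indexes and to collect the resulting connected groups with a union-find structure. This is an illustrative assumption about how step 206 might be realized, not a statement of the method itself; the name `natural_partition` is hypothetical.

```python
def natural_partition(n, T):
    """Group indexes 1..n so that every X in T lies inside a single group N_i."""
    parent = list(range(n + 1))  # parent[0] is unused

    def find(i):
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for X in T:
        members = sorted(X)
        for j in members[1:]:
            parent[find(j)] = find(members[0])  # link every index of X together

    groups = {}
    for i in range(1, n + 1):
        groups.setdefault(find(i), set()).add(i)
    N_parts = sorted(groups.values(), key=min)
    # The empty set is a subset of every N_i, so it appears in every T_i.
    T_parts = [[X for X in T if X <= N_i] for N_i in N_parts]
    return N_parts, T_parts

T = [frozenset({1}), frozenset({1, 2}), frozenset({3}), frozenset({3, 4}), frozenset()]
N_parts, T_parts = natural_partition(4, T)
print(N_parts)
print(T_parts)
```

For the example above, the sketch yields N₁={1,2} and N₂={3,4}, with Ø present in both T₁ and T₂.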

Partitioning can reduce the iterative scaling computation complexity from O(|T|2^(|N|)) to O(|T₁|2^(|N₁|)+ . . . +|T_(k)|2^(|N_(k)|)), making iterative scaling feasible even for large sets of predicates N. For example, when N={1,2, . . . ,12}, |T|=25, k=4, |N_(i)|=3 and |T_(i)|=7, the use of partitioning can reduce the complexity by two orders of magnitude.
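The figure of two orders of magnitude can be checked with a few lines of arithmetic. The Python sketch below simply evaluates the two complexity expressions for the stated sizes, ignoring constant factors; the variable names are hypothetical.

```python
# Unpartitioned cost: |T| * 2^|N| with |T| = 25 and |N| = 12.
full_cost = 25 * 2 ** 12

# Partitioned cost: k = 4 partitions, each with |T_i| = 7 and |N_i| = 3.
partitioned_cost = sum(7 * 2 ** 3 for _ in range(4))

ratio = full_cost / partitioned_cost
print(full_cost, partitioned_cost, round(ratio))
```

The counts are 102400 and 224 respectively, a ratio of roughly 457.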

In practice, partitioning can reduce the computational complexity even more than is indicated above, by avoiding execution of the iterative scaling computation altogether for specified partitions (P_(i),T_(i)). In the above example, for instance, method 200 can compute the selectivity s_(1,2,4) as s_(1,2,4)=s_(1,2)*s₄; because s_(1,2) is known a priori, method 200 need only run iterative scaling on (P₂,T₂) and not on (P₁,T₁). Similarly, if a partition contains only a few predicates, it may be possible to compute the desired ME selectivity analytically, without requiring iterative scaling. For example, suppose that P_(i)={p₁, p₂, p₃}, T_(i)={{1}, {2}, {3}, {1,2}, {2,3}, Ø}, and method 200 needs to compute s_(1,2,3). It can be shown, for this example, that s_(1,2,3)=s_(1,2)*s_(2,3)/s₂, so that no iterative scaling computation is needed. In general, it may be possible to maintain a “library” of such identities, and method 200 may perform a quick syntactic analysis at selectivity-estimation time to see if any of the identities apply.
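The identity s_(1,2,3)=s_(1,2)*s_(2,3)/s₂ can be checked numerically. The Python sketch below is illustrative only: it uses a plain iterative proportional fitting loop as a stand-in for the iterative scaling computation, and the selectivity values are hypothetical (they are not taken from the text and were chosen to be consistent and free of zero atoms). The names `in_C` and `fit_by_proportional_scaling` are likewise hypothetical. The constraint for Ø is omitted because each rescaling step keeps the atoms summing to 1.

```python
from itertools import product

def in_C(atom, X):
    """Atom b is in C(X) when every predicate indexed by X is satisfied in b."""
    return all(atom[i - 1] == 1 for i in X)

def fit_by_proportional_scaling(n, s, sweeps=2000):
    """Iterative proportional fitting over the 2^n atoms, starting from uniform."""
    atoms = list(product((0, 1), repeat=n))
    x = {b: 1.0 / len(atoms) for b in atoms}
    for _ in range(sweeps):
        for X, s_X in s.items():
            inside = sum(x[b] for b in atoms if in_C(b, X))
            outside = sum(x.values()) - inside
            for b in atoms:
                # Rescale so atoms in C(X) sum to s_X and the rest sum to 1 - s_X.
                x[b] *= s_X / inside if in_C(b, X) else (1.0 - s_X) / outside
    return x

# Hypothetical input selectivities for T_i = {{1}, {2}, {3}, {1,2}, {2,3}}.
s = {(1,): 0.5, (2,): 0.4, (3,): 0.3, (1, 2): 0.25, (2, 3): 0.15}

x = fit_by_proportional_scaling(3, s)
s_123_fitted = x[(1, 1, 1)]
s_123_identity = s[(1, 2)] * s[(2, 3)] / s[(2,)]
print(round(s_123_fitted, 6), round(s_123_identity, 6))
```

Under these assumptions the fitted value of s_(1,2,3) agrees with the closed-form identity, which suggests why a lookup of such identities could replace the iterative computation for small partitions.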

Forced Partitioning (Step 206)

In order to provide adequate real-time performance for computation of the constrained optimization problem (e.g., at step 214) using iterative scaling, for example, it may be desirable to ensure that each partition N_(i) of N, as described above, has a cardinality no greater than μ, where μ is a constant natural number greater than or equal to one that depends on the computer hardware and system load. For example, the constant μ may depend primarily on the CPU speed of the computer. On a single-user laptop using an Intel Pentium® III Mobile CPU 1133 MHz with 512 MB RAM, for example, a value of μ=8 has been found to provide computation time typically less than one second for iterative scaling computations.

It may not always be possible to partition N into subsets of cardinality at most μ for a given set T. For instance, if N={1,2,3,4,5,6} and T={{1}, {2}, {3}, {4}, {5}, {6}, {1,2}, {2,3}, {3,4}, {4,5}, {5,6}}, then N cannot be partitioned for μ=3 because for any partitioning of N into N₁ and N₂, there exists at least one X∈T such that X∩N₁≠Ø and X∩N₂≠Ø. Thus, the independence assumption applied between any two partitions N₁ and N₂ in this example cannot result in the correct ME solution for N and T.

In such cases, method 200 may remove elements from T in order to force a partitioning in which |N_(i)|≦μ. However, this forced partitioning can impact the quality of the ME solution, since method 200 is discarding information. Ideally, a forced-partitioning method should remove those elements that have the least impact on the ME solution. Such a method, however, would be very computationally expensive, because it would have to consider the interaction of every constraint with every other constraint. Method 200 may use a pragmatic computational method (e.g., at step 206) for forced partitioning with complexity O(|T|log|T|) that ignores interactions and greedily removes elements from T with the goal of minimizing the impact on the ME solution, described as follows.

When generating partitions, method 200 may keep track of the cardinality c_(i) of each partition N_(i). At the beginning, method 200 may start with |N| partitions, N_(i)={i} and c_(i)=1 for every partition. Method 200 then may iteratively merge partitions based on each element X∈T, i.e.,

$N_{i} = \begin{cases} \bigcup_{j \mid X \subseteq N_{j}} N_{j} & \text{if } i = \min(X); \\ \varnothing & \text{otherwise}, \end{cases}$ which reduces the number of partitions, but increases c_(i). Method 200 may only add an element X∈T to the knowledge set T_(i) of partition N_(i) if afterwards c_(i)≤μ is still satisfied. If c_(i)≤μ is violated, method 200 may ignore X.

In order to have as little impact as possible on the overall ME solution, method 200 may avoid discarding those elements X∈T that correspond to knowledge about the largest deviations from independence. Method 200 can achieve this goal by processing elements in descending order of Δ_(X), where Δ_(X)=s_(X)/Π_(i∈X)s_(i). The quantity Δ_(X) measures the degree to which the corresponding s_(X) constraint in Equation 2 forces the ME solution away from independence. FIG. 5 summarizes the computation for forced partitioning. An efficient implementation could store T as a list sorted according to Δ_(X) and thus have a complexity of O(|T|log|T|).

Example: Forced Partitioning (Step 206)

Let N={1,2,3,4,5,6} and T={{1}, {2}, {3}, {4}, {5}, {6}, {1,2}, {2,3}, {3,4}, {4,5}, {5,6}}. Also suppose that s₁=0.1, s₂=0.2, s₃=0.3, s₄=0.4, s₅=0.5, s₆=0.6 and s_(1,2)=0.1, s_(2,3)=0.2, s_(3,4)=0.3, s_(4,5)=0.4, s_(5,6)=0.5.

It follows that Δ_(1,2)=5, Δ_(2,3)=3.33, Δ_(3,4)=2.5, Δ_(4,5)=2, and Δ_(5,6)=1.67. With μ=3, the forced partitioning method (step 206) therefore obtains k=2, N₁={1,2,3}, T₁={{1}, {2}, {3}, {1,2}, {2,3}}, N₂={4,5,6}, and T₂={{4}, {5}, {6}, {4,5}, {5,6}}. The element {3,4} has been dropped, to enable the partitioning of N into N₁ and N₂. Of all the elements that prevent partitions of size 3 from being constructed, the element {3,4} may have the smallest impact on the ME solution.
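The greedy procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the computation of FIG. 5: it assumes that every single-predicate selectivity s_(i) is available for computing Δ_(X), it merges every partition that X touches, and it uses a union-find structure to track partition sizes. The names `forced_partition`, `kept`, and `dropped` are hypothetical.

```python
from math import prod

def forced_partition(n, s, mu):
    """Greedily keep each X (in descending delta order) unless merging exceeds mu indexes."""
    parent = list(range(n + 1))  # parent[0] is unused
    size = [1] * (n + 1)         # size[r] is the cardinality c_i of the partition rooted at r

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def delta(X):
        # Deviation from independence: s_X divided by the product of the s_i, i in X.
        return s[X] / prod(s[frozenset({i})] for i in X)

    kept, dropped = [], []
    for X in sorted(s, key=delta, reverse=True):
        roots = {find(i) for i in X}
        if sum(size[r] for r in roots) > mu:
            dropped.append(X)    # merging would violate c_i <= mu, so ignore X
            continue
        root, *rest = sorted(roots)
        for r in rest:
            parent[r] = root
            size[root] += size[r]
        kept.append(X)

    groups = {}
    for i in range(1, n + 1):
        groups.setdefault(find(i), set()).add(i)
    return sorted(groups.values(), key=min), kept, dropped

s = {frozenset({i}): i / 10 for i in range(1, 7)}
s.update({frozenset({i, i + 1}): i / 10 for i in range(1, 6)})
parts, kept, dropped = forced_partition(6, s, mu=3)
print(parts, dropped)
```

With the selectivities of this example and μ=3, the sketch reproduces N₁={1,2,3} and N₂={4,5,6}, and {3,4} is the only element dropped.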

It should be understood, of course, that the foregoing relates to exemplary embodiments of the invention and that modifications may be made without departing from the spirit and scope of the invention as set forth in the following claims. 

We claim:
 1. A computer-implemented method for a database, given a set of input selectivities {s_(X): X∈T}, wherein T represents a knowledge set, the method comprising: partitioning a predicate index set N with respect to said input selectivities, wherein a number of said partitions is at least one, wherein the input selectivities can be single or multivariate; and wherein a constant μ is chosen such that the cardinality of each partition N₁, N₂, . . . , N_(k) follows |N_(i)|≤μ for each i∈{1,2, . . . , k}; adjusting said input selectivities to obtain a mutually consistent set of selectivities, wherein each partition set of selectivities is adjusted separately and treated as a separate unit; wherein said adjusting step comprises associating two slack variables with each constraint of said constrained optimization problem, except for the constraint corresponding to s_(Ø); minimizing said adjusting of said input selectivities; wherein said minimizing step comprises minimizing a total sum of said slack variables; and combining the selectivity estimates in the separate partitions using an independence assumption, s_(X)=Π_(i∈{1,2, . . . , k})s_(X∩N_(i)) for X∈2^(N); wherein said given set of input selectivities is not mutually consistent; and said adjusted selectivities are mutually consistent and yield a set of satisfiable constrained optimization problem.
 2. The method of claim 1, wherein said constrained optimization problem is a maximum entropy (ME) constrained optimization problem for estimating selectivities of conjunctive predicates.
 3. The method of claim 1, further comprising: weighting a first pair of slack variables with a first weighting coefficient, said first pair of slack variables associated with a constraint for a selectivity s_(x); weighting a second pair of slack variables with a second weighting coefficient, said second pair of slack variables associated with a constraint for a selectivity s_(y), wherein: selectivity s_(x) is more reliable than selectivity s_(y), and said first weighting coefficient is larger than said second weighting coefficient; and performing said adjusting and minimizing steps with said weighting coefficients.
 4. The method of claim 3, wherein: said minimizing step comprises solving a linear program (LP) having slack variables a_(X)⁺ and a_(X)⁻, said LP including: $\underset{a_{X}^{+},\,a_{X}^{-},\,x_{b}}{\text{minimize}} \sum_{X \in T \setminus \{\varnothing\}} \left( a_{X}^{+} + a_{X}^{-} \right)$ subject to $\sum_{b \in C(X)} x_{b} + a_{X}^{+} - a_{X}^{-} = s_{X},\; X \in T \setminus \{\varnothing\},$ $\sum_{b \in \{0,1\}^{n}} x_{b} = 1,$ $a_{X}^{+} \geq 0,\; a_{X}^{-} \geq 0,$ and $0 \leq s_{X} - a_{X}^{+} + a_{X}^{-} \leq 1$; and said adjusting step includes replacing each s_(X) with s_(X)−a_(X)⁺+a_(X)⁻.
 5. A database query optimizer, wherein the optimizer is embodied in a software program for database management and executed by a processor, the optimizer receiving input of a knowledge set T for a predicate index set N and selectivities {s_(Y): Y∈T} for predicate set P, the optimizer executing steps for: partitioning said predicate index set N with respect to said input selectivities; wherein a constant μ is chosen such that the cardinality of each partition N₁, N₂, . . . , N_(k) follows |N_(i)|≤μ for each i∈{1,2, . . . , k}; selecting, in response to said predicate index set N not satisfying a partitioning condition, X∈T such that X has the smallest impact on a maximum entropy (ME) solution to a constrained optimization problem; removing the element X from T in order to force a partitioning of P and T; partitioning P and T accordingly in response to said predicate index set N satisfying a partitioning condition.
 6. The optimizer of claim 5, further comprising executing steps for: choosing a constant μ; and removing elements from T in order to force a partitioning of P and T with partitions N₁, N₂, . . . , N_(k) in which |N_(i)|≤μ for each i∈{1, 2, . . . , k}.
 7. The optimizer of claim 5, wherein said selecting X∈T further comprises selecting X in descending order of Δ_(X), where Δ_(X)=s_(X)/Π_(i∈X)s_(i).
 8. The optimizer of claim 5, wherein said forcing a partition further comprises: starting with |N| partitions, wherein N_(i)={i} and c_(i)=1 for every partition; iteratively merging partitions based on each element X∈T, so that $N_{i} = \begin{cases} \bigcup_{j \mid X \subseteq N_{j}} N_{j} & \text{if } i = \min(X); \\ \varnothing & \text{otherwise}, \end{cases}$ adding an element X∈T to the knowledge set T_(i) of partition N_(i) only if afterwards c_(i)≤μ is still satisfied and otherwise, if c_(i)≤μ is violated, ignoring X.
 9. The optimizer of claim 5 wherein: said satisfied partitioning condition is that N={1,2, . . . , n} can be split into nonempty disjoint subsets N₁, N₂, . . . , N_(k), such that for each X∈T it is the case that X⊂N_(i) for some i∈{1,2, . . . , k}; and said partitioning P and T accordingly comprises setting P_(i)={p_(j)|j∈N_(i)} and T_(i)={X∈T|X⊂N_(i)} for 1≤i≤k.
 10. A computer program product comprising a computer useable medium including a computer readable program to be loaded into a computer system with a processor, wherein the computer readable program when executed by the processor causes the computer to: input a set of input selectivities {s_(X): X∈T}; partition a predicate index set N with respect to said input selectivities, wherein a number of said partitions is at least one, wherein the input selectivities can be single or multivariate; and wherein a constant μ is chosen such that the cardinality of each partition N₁, N₂, . . . , N_(k) follows |N_(i)|≤μ for each i∈{1,2, . . . , k}; form the |T| constraints Σ_(b∈C(X))x_(b)=s_(X), X∈T; wherein each partition set of selectivities is constrained separately and treated as a separate unit; detect a zero atom b for which x_(b)=0 in said constraints; and combine the zero atoms determined in the separate partitions using an independence assumption.
 11. The computer program product of claim 10, wherein said step of detecting further comprises: detecting said zero atom using an iterative process.
 12. The computer program product of claim 10, wherein said step of detecting further comprises: detecting said zero atom using an approximation process.
 13. The computer program product of claim 10, wherein said step of detecting further comprises: setting an initial set A₀ of candidate zero atoms to contain all of the atoms so that A₀={0, 1}^(n); in a first iteration, setting i=0 and solving the linear program (LP): $\underset{x_{b} \mid b \in \{0,1\}^{n}}{\text{maximize}} \sum_{b \in A_{i}} x_{b}$ subject to the |T| constraints $\sum_{b \in C(X)} x_{b} = s_{X},\; X \in T,$ and to $x_{b} \in [0,1],\; b \in \{0,1\}^{n}$ (Equation 1); in a second iteration, refining the set of candidates as A₁=A₀\{b|x_(b)≠0}, setting i=1, and solving the resulting LP in Equation 1; and further iterating until either (i) A_(i)=Ø or (ii) x_(b)=0 for every b∈A_(i).
 14. The computer program product of claim 10, wherein said step of detecting further comprises: rewriting the selectivity x_(b) of each atom as the sum of two new variables: x_(b)=v_(b)+w_(b); solving the following LP: $\underset{v_{b},\,w_{b}}{\text{maximize}} \sum_{b \in \{0,1\}^{n}} v_{b}$ subject to the |T| constraints $\sum_{b \in C(X)} (v_{b} + w_{b}) = s_{X},\; X \in T,$ and to $0 \leq w_{b} \leq 1$ and $0 \leq v_{b} \leq \varepsilon,\; b \in \{0,1\}^{n}$; wherein ε is a chosen small value in [0, 1]; and detecting an atom b as a zero atom if and only if w_(b)=v_(b)=0.